Enterprise Applications
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- Oceania > Australia > Australian Capital Territory > Canberra (0.04)
- North America > United States > California (0.04)
- (2 more...)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.71)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.68)
- Information Technology > Enterprise Applications > Human Resources > Learning Management (0.52)
- Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.46)
- Education > Educational Setting > Online (0.41)
- Energy (0.34)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.67)
- Information Technology > Enterprise Applications > Human Resources > Learning Management (0.41)
Universal Online Learning with Gradient Variations: A Multi-layer Online Ensemble Approach
In this paper, we propose an online convex optimization approach with two different levels of adaptivity. On a higher level, our approach is agnostic to the unknown types and curvatures of the online functions, while at a lower level, it can exploit the unknown niceness of the environments and attain problem-dependent guarantees.
- Asia > China > Jiangsu Province > Nanjing (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- Information Technology > Artificial Intelligence > Machine Learning (1.00)
- Information Technology > Enterprise Applications > Human Resources > Learning Management (0.41)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.34)
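The paper's multi-layer online ensemble is considerably more involved, but the curvature-agnostic adaptivity the abstract describes can be illustrated with a self-tuning step size. The sketch below is a hypothetical 1-D AdaGrad-norm-style learner (function name and interface are illustrative, not the authors' algorithm): the step size adapts to observed gradients rather than assuming known Lipschitz or strong-convexity constants.

```python
import math

def adagrad_norm_ogd(grads, radius=1.0):
    """Online gradient descent with a self-tuning (AdaGrad-norm) step size.

    A minimal 1-D illustration of curvature-agnostic online learning: the
    step size shrinks with the accumulated squared gradient magnitude, so
    no curvature or Lipschitz constant needs to be known in advance.
    """
    w, g_sq_sum, iterates = 0.0, 0.0, []
    for g in grads:
        g_sq_sum += g * g
        eta = radius / math.sqrt(g_sq_sum) if g_sq_sum > 0 else 0.0
        # gradient step, then project back onto the ball [-radius, radius]
        w = max(-radius, min(radius, w - eta * g))
        iterates.append(w)
    return iterates
```

In "nice" environments where gradients vary little, the accumulated sum grows slowly and the effective step size stays large, which is the flavor of problem-dependent guarantee the paper formalizes.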
Salesforce Workers Circulate Open Letter Urging CEO Marc Benioff to Denounce ICE
The letter comes after Benioff joked at a company event on Monday that ICE was monitoring international employees in attendance, sparking immediate backlash. Employees at Salesforce are circulating an internal letter to chief executive Marc Benioff calling on him to denounce recent actions by US Immigration and Customs Enforcement, prohibit the use of Salesforce software by immigration agents, and back federal legislation that would significantly reform the agency. The letter specifically cites the "recent killings of Renee Good and Alex Pretti in Minneapolis" as catalysts, calling them the "devastating indictment of a system that has discarded human decency." It's unclear how many signatories the letter has received so far. The letter, which has not been reported on previously, is being organized amid Salesforce's annual leadership kickoff event this week in Las Vegas.
- North America > United States > Nevada > Clark County > Las Vegas (0.25)
- North America > United States > Minnesota > Hennepin County > Minneapolis (0.25)
- Europe > Italy (0.15)
- (3 more...)
- Information Technology > Enterprise Applications > Customer Relationship Management (1.00)
- Information Technology > Artificial Intelligence (0.75)
Partial Feedback Online Learning
Shao, Shihao, Fang, Cong, Lin, Zhouchen, Tao, Dacheng
We study partial-feedback online learning, where each instance admits a set of correct labels, but the learner only observes one correct label per round; any prediction within the correct set is counted as correct. This model captures settings such as language generation, where multiple responses may be valid but data provide only a single reference. We give a near-complete characterization of minimax regret for both deterministic and randomized learners in the set-realizable regime, i.e., in the regime where sublinear regret is generally attainable. For deterministic learners, we introduce the Partial-Feedback Littlestone dimension (PFLdim) and show it precisely governs learnability and minimax regret; technically, PFLdim cannot be defined via the standard version space, requiring a new collection version space viewpoint and an auxiliary dimension used only in the proof. We further develop the Partial-Feedback Measure Shattering dimension (PMSdim) to obtain tight bounds for randomized learners. We identify broad conditions ensuring inseparability between deterministic and randomized learnability (e.g., finite Helly number or nested-inclusion label structure), and extend the argument to set-valued online learning, resolving an open question of Raman et al. [2024b]. Finally, we show a sharp separation from weaker realistic and agnostic variants: outside set realizability, the problem can become information-theoretically intractable, with linear regret possible even for $|H|=2$. This highlights the need for fundamentally new, noise-sensitive complexity measures to meaningfully characterize learnability beyond set realizability.
- Research Report (0.40)
- Overview (0.34)
- Information Technology > Artificial Intelligence > Machine Learning (1.00)
- Information Technology > Enterprise Applications > Human Resources > Learning Management (0.82)
- Information Technology > Artificial Intelligence > Natural Language (0.65)
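The paper's PFLdim analysis explicitly departs from the standard version space, so it has no short reference implementation; the sketch below is instead the classic Halving baseline for the full-feedback realizable setting that the collection-version-space viewpoint generalizes. Names and the `(x, y)` round format are illustrative assumptions, not the authors' method.

```python
from collections import Counter

def halving_learner(hypotheses, rounds):
    """Classic Halving for realizable online classification (full feedback).

    `hypotheses` is a finite list of functions x -> label; `rounds` yields
    (x, y) pairs with y the revealed correct label. Predict the majority
    vote of the surviving version space, then discard every hypothesis
    inconsistent with the revealed label. Returns the mistake count, which
    is at most log2(len(hypotheses)) in the realizable case.
    """
    version_space = list(hypotheses)
    mistakes = 0
    for x, y in rounds:
        votes = Counter(h(x) for h in version_space)
        prediction = votes.most_common(1)[0][0]  # majority label
        if prediction != y:
            mistakes += 1
        version_space = [h for h in version_space if h(x) == y]
    return mistakes
```

Under partial feedback, eliminating on the single revealed label is no longer sound (a hypothesis may predict a different but still-correct label), which is exactly why the paper needs its collection version space.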
Statistical Reinforcement Learning in the Real World: A Survey of Challenges and Future Directions
Gazi, Asim H., Guo, Yongyi, Gao, Daiqi, Xu, Ziping, Zhang, Kelly W., Murphy, Susan A.
Reinforcement learning (RL) has achieved remarkable success in real-world decision-making across diverse domains, including gaming, robotics, online advertising, public health, and natural language processing. Despite these advances, a substantial gap remains between RL research and its deployment in many practical settings. Two recurring challenges often underlie this gap. First, many settings offer limited opportunity for the agent to interact extensively with the target environment due to practical constraints. Second, many target environments often undergo substantial changes, requiring redesign and redeployment of RL systems (e.g., advancements in science and technology that change the landscape of healthcare delivery). Addressing these challenges and bridging the gap between basic research and application requires theory and methodology that directly inform the design, implementation, and continual improvement of RL systems in real-world settings. In this paper, we frame the application of RL in practice as a three-component process: (i) online learning and optimization during deployment, (ii) post- or between-deployment offline analyses, and (iii) repeated cycles of deployment and redeployment to continually improve the RL system. We provide a narrative review of recent advances in statistical RL that address these components, including methods for maximizing data utility for between-deployment inference, enhancing sample efficiency for online learning within-deployment, and designing sequences of deployments for continual improvement. We also outline future research directions in statistical RL that are use-inspired -- aiming for impactful application of RL in practice.
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- North America > United States > Wisconsin > Dane County > Madison (0.04)
- North America > United States > North Carolina (0.04)
- (2 more...)
- Research Report > Strength High (1.00)
- Research Report > Experimental Study (1.00)
- Overview (1.00)
- Instructional Material (1.00)
- Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
- Energy (1.00)
- Education > Educational Setting > Online (1.00)
- (3 more...)
- Information Technology > Enterprise Applications > Human Resources > Learning Management (1.00)
- Information Technology > Artificial Intelligence > Robots (1.00)
- Information Technology > Artificial Intelligence > Natural Language (1.00)
- (3 more...)
'No reasons to own': Software stocks sink on fear of new AI tool
The new year was supposed to bring opportunities for beaten-down software stocks. Instead, the group is off to its worst start in years. The release of a new artificial intelligence tool from startup Anthropic on Jan. 12 rekindled fears about disruption that weighed on software makers in 2025. TurboTax owner Intuit tumbled 16% last week, its worst since 2022, while Adobe and Salesforce, which makes customer relationship management software, both sank more than 11%. All told, a group of software-as-a-service stocks tracked by Morgan Stanley is down 15% so far this year, following a drop of 11% in 2025.
- Asia > Middle East > Iran (0.43)
- Asia > China (0.41)
- Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.07)
- (3 more...)
- Information Technology > Software (1.00)
- Government > Foreign Policy (1.00)
- Information Technology > Artificial Intelligence (1.00)
- Information Technology > Communications > Social Media (0.79)
- Information Technology > Enterprise Applications > Customer Relationship Management (0.56)
Online Learning with Limited Information in the Sliding Window Model
Braverman, Vladimir, Garg, Sumegha, Wang, Chen, Woodruff, David P., Zhou, Samson
Motivated by recent work on the experts problem in the streaming model, we consider the experts problem in the sliding window model. The sliding window model is a well-studied model that captures applications such as traffic monitoring, epidemic tracking, and automated trading, where recent information is more valuable than older data. Formally, we have $n$ experts, $T$ days, the ability to query the predictions of $q$ experts each day, and a limited amount of memory, and we should achieve the (near-)optimal $\sqrt{nW}\text{polylog}(nT)$ regret over any window of the last $W$ days. While it is impossible to achieve such regret with $1$ query, we show that with $2$ queries we can achieve such regret with only $\text{polylog}(nT)$ bits of memory. Not only are our algorithms optimal for sliding windows, but we also show for every interval $\mathcal{I}$ of days that we achieve $\sqrt{n|\mathcal{I}|}\text{polylog}(nT)$ regret with $2$ queries and only $\text{polylog}(nT)$ bits of memory, providing an exponential improvement on the memory of previous interval regret algorithms. Building upon these techniques, we address the bandit problem in data streams, where $q=1$, achieving $n T^{2/3}\text{polylog}(T)$ regret with $\text{polylog}(nT)$ memory, which is the first sublinear regret in the streaming model in the bandit setting with polylogarithmic memory; this can be further improved to the optimal $\mathcal{O}(\sqrt{nT})$ regret if the best expert's losses are in a random order.
- Europe > Austria > Vienna (0.14)
- North America > United States > Texas (0.04)
- North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
- (3 more...)
- Information Technology > Security & Privacy (1.00)
- Education > Educational Setting > Online (0.51)
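The paper's contribution is matching near-optimal window regret with only 2 queries per day and polylog memory; the standard full-information benchmark such results are measured against is multiplicative weights (Hedge). The sketch below is that baseline, with an assumed list-of-loss-vectors interface rather than anything from the paper.

```python
import math

def hedge(loss_rounds, n_experts, eta=0.5):
    """Multiplicative-weights (Hedge) baseline for the experts problem.

    Full information: every expert's loss in [0, 1] is observed each day,
    and the algorithm keeps one weight per expert (linear memory in n) --
    exactly the query and memory budgets the sliding-window results cut.
    Returns the cumulative loss of the weighted-average prediction.
    """
    weights = [1.0] * n_experts
    total_loss = 0.0
    for losses in loss_rounds:
        z = sum(weights)
        # learner suffers the weight-averaged loss of the experts
        total_loss += sum(w * l for w, l in zip(weights, losses)) / z
        # exponentially downweight experts in proportion to their loss
        weights = [w * math.exp(-eta * l) for w, l in zip(weights, losses)]
    return total_loss
```

With a tuned $\eta$ this achieves the classic $O(\sqrt{T \log n})$ regret over the whole horizon; restricting attention to a recent window, and to 2 queried experts per day, is what makes the paper's setting hard.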
Fully Unconstrained Online Learning
Importantly, this matches the optimal bound $G\|w_\star\|\sqrt{T}$ available with such knowledge (up to logarithmic factors), unless either $\|w_\star\|$ or $G$ is so large that even $G\|w_\star\|\sqrt{T}$ is roughly linear in $T$. Thus, at a high level it matches the optimal bound in all cases in which one can achieve sublinear regret.
- Information Technology > Enterprise Applications > Human Resources > Learning Management (0.45)
- Information Technology > Artificial Intelligence > Machine Learning (0.45)
Online Learning of Delayed Choices
Choice models are essential for understanding decision-making processes in domains like online advertising, product recommendations, and assortment optimization. The Multinomial Logit (MNL) model is particularly versatile in selecting products or advertisements for display. However, challenges arise with unknown MNL parameters and delayed feedback, requiring sellers to learn customers' choice behavior and make dynamic decisions with biased knowledge due to delays. We address these challenges by developing an algorithm that handles delayed feedback, balancing exploration and exploitation using confidence bounds and optimism. We first consider a censored setting where a threshold for considering feedback is imposed by business requirements. Our algorithm demonstrates a $\tilde{O}(\sqrt{NT})$ regret, with a matching lower bound up to a logarithmic term. Furthermore, we extend our analysis to environments with non-thresholded delays, achieving a $\tilde{O}(\sqrt{NT})$ regret. To validate our approach, we conduct experiments that confirm the effectiveness of our algorithm.
- Marketing (0.61)
- Education > Educational Setting > Online (0.44)
- Information Technology > Enterprise Applications > Human Resources > Learning Management (0.44)
- Information Technology > Artificial Intelligence > Machine Learning (0.44)
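The paper's algorithm operates on the MNL choice model; as a simpler illustration of its core mechanism (optimism from confidence bounds computed only on feedback that has already arrived), the sketch below applies UCB-style selection to a plain multi-armed bandit with a fixed feedback delay. The function, its signature, and the fixed-delay assumption are all illustrative, not the authors' method.

```python
import math
from collections import deque

def delayed_ucb(reward_fn, n_arms, horizon, delay):
    """UCB selection when each reward arrives `delay` rounds late.

    Pulls the arm with the highest upper confidence bound, where means
    and counts reflect only feedback received so far; pending rewards sit
    in a queue until their arrival time. `reward_fn(arm, t)` returns a
    reward in [0, 1]. Returns the list of arms pulled.
    """
    counts = [0] * n_arms
    sums = [0.0] * n_arms
    pending = deque()  # (arrival_time, arm, reward), in arrival order
    pulls = []
    for t in range(horizon):
        while pending and pending[0][0] <= t:  # absorb arrived feedback
            _, arm, r = pending.popleft()
            counts[arm] += 1
            sums[arm] += r
        def ucb(a):
            if counts[a] == 0:
                return float("inf")  # force exploration of unseen arms
            return sums[a] / counts[a] + math.sqrt(2 * math.log(t + 1) / counts[a])
        choice = max(range(n_arms), key=ucb)
        pulls.append(choice)
        pending.append((t + delay, choice, reward_fn(choice, t)))
    return pulls
```

The delay biases the available estimates exactly as the abstract describes: during the lag the learner keeps acting on stale counts, which is the source of the extra regret terms the paper controls.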